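Paper Roundup | Federated Learning x CVPR'2023 (Part 1)
This post, compiled by the blogger 白小鱼 (Bai Xiaoyu), collects the federated-learning papers from the CVPR 2023 conference together with translations of their abstracts.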
Confidence-Aware Personalized Federated Learning via Variational Expectation Maximization
Authors: Junyi Zhu; Xingchen Ma; Matthew B. Blaschko
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Zhu_Confidence-Aware_Personalized_Federated_Learning_via_Variational_Expectation_Maximization_CVPR_2023_paper.html
Abstract: Federated Learning (FL) is a distributed learning scheme to train a shared model across clients. One common and fundamental challenge in FL is that the sets of data across clients could be non-identically distributed and have different sizes. Personalized Federated Learning (PFL) attempts to solve this challenge via locally adapted models. In this work, we present a novel framework for PFL based on hierarchical Bayesian modeling and variational inference. A global model is introduced as a latent variable to augment the joint distribution of clients' parameters and capture the common trends of different clients, optimization is derived based on the principle of maximizing the marginal likelihood and conducted using variational expectation maximization. Our algorithm gives rise to a closed-form estimation of a confidence value which comprises the uncertainty of clients' parameters and local model deviations from the global model. The confidence value is used to weigh clients' parameters in the aggregation stage and adjust the regularization effect of the global model. We evaluate our method through extensive empirical studies on multiple datasets. Experimental results show that our approach obtains competitive results under mild heterogeneous circumstances while significantly outperforming state-of-the-art PFL frameworks in highly heterogeneous settings.
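abstractTranslation: Federated learning (FL) is a distributed scheme for training a model shared across clients. A basic difficulty is that client datasets can differ in both distribution and size, which personalized federated learning (PFL) addresses with locally adapted models. This work proposes a PFL framework built on hierarchical Bayesian modeling and variational inference: a global model is introduced as a latent variable that augments the joint distribution of client parameters and captures trends common to the clients, and the marginal likelihood is maximized with variational expectation maximization. The algorithm yields a closed-form confidence value that combines the uncertainty of a client's parameters with the deviation of its local model from the global model; this value weighs client parameters during aggregation and adjusts the regularization strength of the global model. Across multiple datasets the method is competitive under mild heterogeneity and significantly outperforms state-of-the-art PFL frameworks under high heterogeneity.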
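The aggregation step described above, in which a per-client confidence value weighs that client's parameters, can be illustrated roughly as follows. This is a minimal sketch rather than the paper's algorithm: the function name `aggregate_by_confidence` is hypothetical, and the confidence values are simply passed in as inputs instead of being computed from the paper's closed-form estimate.

```python
from typing import List


def aggregate_by_confidence(client_params: List[List[float]],
                            confidences: List[float]) -> List[float]:
    """Average client parameter vectors, weighting each by its confidence.

    The confidences are plain inputs here (an assumption of this sketch);
    the paper derives them in closed form from its variational EM updates.
    """
    if not client_params or len(client_params) != len(confidences):
        raise ValueError("need one confidence value per client")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("confidences must sum to a positive value")

    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, conf in zip(client_params, confidences):
        weight = conf / total  # normalized aggregation weight
        for j in range(dim):
            aggregated[j] += weight * params[j]
    return aggregated


# Two hypothetical clients; the first is three times as confident.
clients = [[1.0, 0.0], [3.0, 4.0]]
print(aggregate_by_confidence(clients, [3.0, 1.0]))  # [1.5, 1.0]
```

With equal confidences this reduces to a plain average of the client vectors, which is a convenient sanity check.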
The Resource Problem of Using Linear Layer Leakage Attack in Federated Learning
Authors: Joshua C. Zhao; Ahmed Roushdy Elkordy; Atul Sharma; Yahya H. Ezzeldin; Salman Avestimehr; Saurabh Bagchi
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Zhao_The_Resource_Problem_of_Using_Linear_Layer_Leakage_Attack_in_CVPR_2023_paper.html
Abstract: Secure aggregation promises a heightened level of privacy in federated learning, maintaining that a server only has access to a decrypted aggregate update. Within this setting, linear layer leakage methods are the only data reconstruction attacks able to scale and achieve a high leakage rate regardless of the number of clients or batch size. This is done through increasing the size of an injected fully-connected (FC) layer. We show that this results in a resource overhead which grows larger with an increasing number of clients. We show that this resource overhead is caused by an incorrect perspective in all prior work that treats an attack on an aggregate update in the same way as an individual update with a larger batch size. Instead, by attacking the update from the perspective that aggregation is combining multiple individual updates, this allows the application of sparsity to alleviate resource overhead. We show that the use of sparsity can decrease the model size overhead by over 327x and the computation time by 3.34x compared to SOTA while maintaining equivalent total leakage rate, 77% even with 1000 clients in aggregation.
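abstractTranslation: Secure aggregation is meant to raise privacy in federated learning by letting the server see only the decrypted aggregate update. In that setting, linear layer leakage methods are the only data reconstruction attacks that scale to a high leakage rate regardless of client count or batch size, which they do by enlarging an injected fully-connected (FC) layer. The paper shows that this creates a resource overhead that grows with the number of clients, and traces the overhead to a mistaken view in prior work that treats an attack on an aggregate update like an attack on a single update with a larger batch. Treating aggregation as a combination of multiple individual updates instead allows sparsity to be applied. Compared with the state of the art, sparsity cuts the model size overhead by more than 327x and computation time by 3.34x at an equivalent total leakage rate, 77% even with 1000 clients in aggregation.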
Notes:
[pdf] (http://arxiv.org/abs/2303.14868)
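For readers new to linear layer leakage, the underlying identity is that for an FC layer z = Wx + b and a single input x, the weight gradient of neuron i is (dL/dz_i) * x and the bias gradient is dL/dz_i, so dividing one by the other returns x. The sketch below shows only this single-sample case; the function names are hypothetical, and the attacks discussed in the paper additionally deal with batches, aggregation across clients, and the sparsity it proposes.

```python
from typing import List, Tuple


def fc_gradients(x: List[float],
                 upstream: List[float]) -> Tuple[List[List[float]], List[float]]:
    """Gradients of an FC layer z = W x + b for one sample.

    upstream[i] is dL/dz_i. Returns (grad_W, grad_b), where
    grad_W[i][j] = upstream[i] * x[j] and grad_b[i] = upstream[i].
    """
    grad_w = [[g * xj for xj in x] for g in upstream]
    grad_b = list(upstream)
    return grad_w, grad_b


def reconstruct_input(grad_w: List[List[float]],
                      grad_b: List[float],
                      eps: float = 1e-12) -> List[float]:
    """Recover x from the first neuron whose bias gradient is non-zero."""
    for row, gb in zip(grad_w, grad_b):
        if abs(gb) > eps:
            return [gw / gb for gw in row]
    raise ValueError("no neuron with a non-zero bias gradient")


# Hypothetical private input and upstream gradients for two neurons.
x = [0.5, -2.0, 4.0]
g_w, g_b = fc_gradients(x, upstream=[0.0, 0.25])
print(reconstruct_input(g_w, g_b))  # [0.5, -2.0, 4.0]
```

The first neuron has a zero bias gradient and is skipped, which hints at why real attacks care about which neurons each sample activates.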
Federated Domain Generalization With Generalization Adjustment
Authors: Ruipeng Zhang; Qinwei Xu; Jiangchao Yao; Ya Zhang; Qi Tian; Yanfeng Wang
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Zhang_Federated_Domain_Generalization_With_Generalization_Adjustment_CVPR_2023_paper.html
Abstract: Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift. Recent exploration mainly focuses on designing an unbiased training strategy within each individual domain. However, without the support of multi-domain data jointly in the mini-batch training, almost all methods cannot guarantee the generalization under domain shift. To overcome this problem, we propose a novel global objective incorporating a new variance reduction regularizer to encourage fairness. A novel FL-friendly method named Generalization Adjustment (GA) is proposed to optimize the above objective by dynamically calibrating the aggregation weights. The theoretical analysis of GA demonstrates the possibility to achieve a tighter generalization bound with an explicit re-weighted aggregation, substituting the implicit multi-domain data sharing that is only applicable to the conventional DG settings. Besides, the proposed algorithm is generic and can be combined with any local client training-based methods. Extensive experiments on several benchmark datasets have shown the effectiveness of the proposed method, with consistent improvements over several FedDG algorithms when used in combination. The source code is released at https://github.com/MediaBrain-SJTU/FedDG-GA.
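abstractTranslation: Federated Domain Generalization (FedDG) aims to learn, in a privacy-preserving way, a global model that generalizes to new clients that may exhibit domain shift. Recent work mostly designs unbiased training strategies within each individual domain, but without multi-domain data available jointly in mini-batch training almost no method can guarantee generalization under domain shift. The paper proposes a global objective with a new variance reduction regularizer that encourages fairness, and an FL-friendly method, Generalization Adjustment (GA), that optimizes it by dynamically calibrating the aggregation weights. The theoretical analysis shows that explicit re-weighted aggregation can achieve a tighter generalization bound, replacing the implicit multi-domain data sharing available only in conventional DG settings. The algorithm is generic and can be combined with any method based on local client training; on several benchmarks it consistently improves several FedDG algorithms. Source code: https://github.com/MediaBrain-SJTU/FedDG-GA.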
Notes: https://github.com/MediaBrain-SJTU/FedDG-GA
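To make "dynamically calibrating the aggregation weights" concrete, here is one way such a calibration could look. The update rule below is an assumption made for illustration, not a transcription of GA: the names `adjust_weights`, `gaps`, and `step` are hypothetical, and the idea is simply that clients on which the global model generalizes worse than average receive a little more aggregation weight each round.

```python
from typing import List


def adjust_weights(weights: List[float],
                   gaps: List[float],
                   step: float) -> List[float]:
    """Shift aggregation weights toward clients with larger generalization gaps.

    weights: current aggregation weights (assumed to sum to 1).
    gaps:    per-client generalization gap estimates (hypothetical inputs).
    step:    maximum change applied to any single weight before renormalizing.
    """
    mean_gap = sum(gaps) / len(gaps)
    centered = [g - mean_gap for g in gaps]
    scale = max(abs(c) for c in centered)
    if scale == 0:
        return list(weights)  # all gaps equal: nothing to calibrate

    # Move each weight by at most `step`, clamp at zero, then renormalize.
    raw = [max(w + step * c / scale, 0.0) for w, c in zip(weights, centered)]
    total = sum(raw)
    if total == 0:
        return list(weights)
    return [r / total for r in raw]


# Client 1 has the larger gap, so its weight grows at client 0's expense.
new_weights = adjust_weights([0.5, 0.5], gaps=[0.2, 0.4], step=0.1)
print(new_weights)
```

Shrinking `step` over the rounds would keep the weights from oscillating; whether and how to do so is a design choice this sketch leaves open.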
Bias-Eliminating Augmentation Learning for Debiased Federated Learning
Authors: Yuan-Yi Xu; Ci-Siang Lin; Yu-Chiang Frank Wang
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Xu_Bias-Eliminating_Augmentation_Learning_for_Debiased_Federated_Learning_CVPR_2023_paper.html
Abstract: Learning models trained on biased datasets tend to observe correlations between categorical and undesirable features, which result in degraded performances. Most existing debiased learning models are designed for centralized machine learning, which cannot be directly applied to distributed settings like federated learning (FL), which collects data at distinct clients with privacy preserved. To tackle the challenging task of debiased federated learning, we present a novel FL framework of Bias-Eliminating Augmentation Learning (FedBEAL), which learns to deploy Bias-Eliminating Augmenters (BEA) for producing client-specific bias-conflicting samples at each client. Since the bias types or attributes are not known in advance, a unique learning strategy is presented to jointly train BEA with the proposed FL framework. Extensive image classification experiments on datasets with various bias types confirm the effectiveness and applicability of our FedBEAL, which performs favorably against state-of-the-art debiasing and FL methods for debiased FL.
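abstractTranslation: Models trained on biased datasets tend to pick up correlations between categorical and undesirable features, which degrades performance. Most debiasing methods are designed for centralized learning and cannot be applied directly to distributed settings such as federated learning (FL), where data stays at separate clients to preserve privacy. The paper presents FedBEAL (Bias-Eliminating Augmentation Learning), an FL framework that learns to deploy Bias-Eliminating Augmenters (BEA) to produce client-specific bias-conflicting samples at each client. Because bias types and attributes are not known in advance, a dedicated learning strategy trains the BEA jointly with the FL framework. Image classification experiments on datasets with various bias types show that FedBEAL performs favorably against state-of-the-art debiasing and FL methods.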
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning
Authors: Yuanhao Xiong; Ruochen Wang; Minhao Cheng; Felix Yu; Cho-Jui Hsieh
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Xiong_FedDM_Iterative_Distribution_Matching_for_Communication-Efficient_Federated_Learning_CVPR_2023_paper.html
Abstract: Federated learning (FL) has recently attracted increasing attention from academia and industry, with the ultimate goal of achieving collaborative training under privacy and communication constraints. Existing iterative model averaging based FL algorithms require a large number of communication rounds to obtain a well-performed model due to extremely unbalanced and non-i.i.d data partitioning among different clients. Thus, we propose FedDM to build the global training objective from multiple local surrogate functions, which enables the server to gain a more global view of the loss landscape. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data through distribution matching. FedDM reduces communication rounds and improves model quality by transmitting more informative and smaller synthesized data compared with unwieldy model weights. We conduct extensive experiments on three image classification datasets, and results show that our method can outperform other FL counterparts in terms of efficiency and model performance. Moreover, we demonstrate that FedDM can be adapted to preserve differential privacy with Gaussian mechanism and train a better model under the same privacy budget.
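abstractTranslation: Federated learning (FL) has drawn growing attention from academia and industry, with the goal of collaborative training under privacy and communication constraints. FL algorithms based on iterative model averaging need many communication rounds to reach a well-performing model because data is partitioned across clients in an extremely unbalanced, non-i.i.d. way. FedDM instead builds the global training objective from multiple local surrogate functions, giving the server a more global view of the loss landscape: each client constructs a synthetic dataset that locally matches the loss landscape of its original data through distribution matching. Transmitting this smaller, more informative synthetic data rather than unwieldy model weights reduces communication rounds and improves model quality. Experiments on three image classification datasets show gains in efficiency and model performance over other FL methods, and FedDM can be adapted to preserve differential privacy with the Gaussian mechanism, training a better model under the same privacy budget.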
DaFKD: Domain-Aware Federated Knowledge Distillation
Authors: Haozhao Wang; Yichen Li; Wenchao Xu; Ruixuan Li; Yufeng Zhan; Zhigang Zeng
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Wang_DaFKD_Domain-Aware_Federated_Knowledge_Distillation_CVPR_2023_paper.html
Abstract: Federated Distillation (FD) has recently attracted increasing attention for its efficiency in aggregating multiple diverse local models trained from statistically heterogeneous data of distributed clients. Existing FD methods generally treat these models equally by merely computing the average of their output soft predictions for some given input distillation sample, which does not take the diversity across all local models into account, thus leading to degraded performance of the aggregated model, especially when some local models learn little knowledge about the sample. In this paper, we propose a new perspective that treats the local data in each client as a specific domain and design a novel domain knowledge aware federated distillation method, dubbed DaFKD, that can discern the importance of each model to the distillation sample, and thus is able to optimize the ensemble of soft predictions from diverse models. Specifically, we employ a domain discriminator for each client, which is trained to identify the correlation factor between the sample and the corresponding domain. Then, to facilitate the training of the domain discriminator while saving communication costs, we propose sharing its partial parameters with the classification model. Extensive experiments on various datasets and settings show that the proposed method can improve the model accuracy by up to 6.02% compared to state-of-the-art baselines.
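abstractTranslation: Federated Distillation (FD) has attracted attention for its efficiency in aggregating diverse local models trained on the statistically heterogeneous data of distributed clients. Existing FD methods treat these models equally, simply averaging their soft predictions on a given distillation sample; this ignores the diversity among local models and degrades the aggregated model, especially when some local models know little about the sample. The paper treats each client's local data as a specific domain and designs a domain-knowledge-aware federated distillation method, DaFKD, that discerns the importance of each model to the distillation sample and so optimizes the ensemble of soft predictions. Each client has a domain discriminator trained to identify the correlation factor between a sample and the client's domain. To ease the discriminator's training while saving communication, part of its parameters are shared with the classification model. Experiments on various datasets and settings show accuracy improvements of up to 6.02% over state-of-the-art baselines.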
Notes: https://github.com/haozhaowang/DaFKD2023
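The contrast between plain averaging and a domain-aware ensemble can be shown in a few lines. This is only a sketch under stated assumptions: `weighted_ensemble` is a hypothetical name, and the per-client `domain_scores` stand in for whatever correlation factors the trained domain discriminators would output for a given distillation sample.

```python
from typing import List


def weighted_ensemble(soft_preds: List[List[float]],
                      domain_scores: List[float]) -> List[float]:
    """Combine per-client soft predictions, weighting each client by a score.

    soft_preds[c][k] is client c's predicted probability for class k.
    domain_scores[c] is a non-negative relevance score for client c
    (a stand-in for a discriminator output; an assumption of this sketch).
    """
    if not soft_preds or len(soft_preds) != len(domain_scores):
        raise ValueError("need one score per client prediction")
    total = sum(domain_scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    n_classes = len(soft_preds[0])
    return [
        sum(s * p[k] for s, p in zip(domain_scores, soft_preds)) / total
        for k in range(n_classes)
    ]


preds = [[0.9, 0.1], [0.2, 0.8]]               # two hypothetical client models
uniform = weighted_ensemble(preds, [1.0, 1.0])  # plain averaging
aware = weighted_ensemble(preds, [0.75, 0.25])  # client 0 knows this sample better
print(uniform, aware)
```

With uniform scores the function reproduces the equal-treatment baseline the abstract criticizes; unequal scores pull the ensemble toward the client whose domain the sample resembles.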
Make Landscape Flatter in Differentially Private Federated Learning
Authors: Yifan Shi; Yingqi Liu; Kang Wei; Li Shen; Xueqian Wang; Dacheng Tao
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Shi_Make_Landscape_Flatter_in_Differentially_Private_Federated_Learning_CVPR_2023_paper.html
Abstract: To defend the inference attacks and mitigate the sensitive information leakages in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise. However, existing DPFL methods tend to make a sharper loss landscape and have poorer weight perturbation robustness, resulting in severe performance degradation. To alleviate these issues, we propose a novel DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to mitigate the negative impact of DP. Specifically, DP-FedSAM integrates Sharpness Aware Minimization (SAM) optimizer to generate local flatness models with better stability and weight perturbation robustness, which results in the small norm of local updates and robustness to DP noise, thereby improving the performance. From the theoretical perspective, we analyze in detail how DP-FedSAM mitigates the performance degradation induced by DP. Meanwhile, we give rigorous privacy guarantees with Renyi DP and present the sensitivity analysis of local updates. At last, we empirically confirm that our algorithm achieves state-of-the-art (SOTA) performance compared with existing SOTA baselines in DPFL.
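abstractTranslation: To defend against inference attacks and limit leakage of sensitive information in federated learning (FL), client-level differentially private FL (DPFL) is the de facto standard: it clips local updates and adds random noise. Existing DPFL methods, however, tend to produce a sharper loss landscape and poorer robustness to weight perturbation, causing severe performance degradation. The paper proposes DP-FedSAM, a DPFL algorithm that uses gradient perturbation to mitigate the negative impact of DP. It integrates the Sharpness Aware Minimization (SAM) optimizer to produce locally flat models with better stability and robustness to weight perturbation, which leads to local updates with small norm and robustness to DP noise, and thus better performance. The paper analyzes theoretically how DP-FedSAM mitigates DP-induced degradation, gives rigorous privacy guarantees with Renyi DP, presents a sensitivity analysis of local updates, and confirms empirically that the algorithm achieves state-of-the-art performance among DPFL baselines.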
Notes:
[pdf] (http://arxiv.org/abs/2303.11242)
[code] (https://github.com/YMJS-Irfan/DP-FedSAM)
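The two ingredients named in the abstract, clipping plus Gaussian noise on the local update and a SAM-style perturbed gradient step, can be sketched separately as below. This is an illustration under assumptions rather than DP-FedSAM itself: the function names are hypothetical, the noise standard deviation is taken as `noise_multiplier * clip_norm` for simplicity, and no privacy accounting is performed.

```python
import math
import random
from typing import Callable, List


def clip_update(update: List[float], clip_norm: float) -> List[float]:
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def privatize_update(update: List[float], clip_norm: float,
                     noise_multiplier: float, rng: random.Random) -> List[float]:
    """Clip the update, then add independent Gaussian noise per coordinate."""
    sigma = noise_multiplier * clip_norm  # assumed noise scale for this sketch
    return [u + rng.gauss(0.0, sigma) for u in clip_update(update, clip_norm)]


def sam_step(w: List[float], grad_fn: Callable[[List[float]], List[float]],
             lr: float, rho: float) -> List[float]:
    """One SAM step: take the gradient at w + rho * g / ||g||, apply it at w."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0:
        return list(w)  # zero gradient: nothing to update
    perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_sam = grad_fn(perturbed)
    return [wi - lr * gi for wi, gi in zip(w, g_sam)]


# Toy loss 0.5 * ||w||^2, whose gradient is w itself.
w_new = sam_step([3.0, 4.0], grad_fn=lambda w: list(w), lr=0.1, rho=0.5)
local_update = [a - b for a, b in zip(w_new, [3.0, 4.0])]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(privatize_update(local_update, clip_norm=1.0, noise_multiplier=0.5, rng=rng))
```

A smaller update norm means clipping discards less of the update, which is the intuition the abstract gives for pairing SAM with DP.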
How To Prevent the Poor Performance Clients for Personalized Federated Learning?
Authors: Zhe Qu; Xingyu Li; Xiao Han; Rui Duan; Chengchao Shen; Lixing Chen
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Qu_How_To_Prevent_the_Poor_Performance_Clients_for_Personalized_Federated_CVPR_2023_paper.html
Abstract: Personalized federated learning (pFL) collaboratively trains personalized models, which provides a customized model solution for individual clients in the presence of heterogeneous distributed local data. Although many recent studies have applied various algorithms to enhance personalization in pFL, they mainly focus on improving the performance from averaging or top perspective. However, part of the clients may fall into poor performance and are not clearly discussed. Therefore, how to prevent these poor clients should be considered critically. Intuitively, these poor clients may come from biased universal information shared with others. To address this issue, we propose a novel pFL strategy, called Personalize Locally, Generalize Universally (PLGU). PLGU generalizes the fine-grained universal information and moderates its biased performance by designing a Layer-Wised Sharpness Aware Minimization (LWSAM) algorithm while keeping the personalization local. Specifically, we embed our proposed PLGU strategy into two pFL schemes concluded in this paper: with/without a global model, and present the training procedures in detail. Through in-depth study, we show that the proposed PLGU strategy achieves competitive generalization bounds on both considered pFL schemes. Our extensive experimental results show that all the proposed PLGU based-algorithms achieve state-of-the-art performance.
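abstractTranslation: Personalized federated learning (pFL) collaboratively trains personalized models, giving each client a customized model when local data is heterogeneous. Recent studies mainly improve average or top performance, while the clients that end up with poor performance receive little discussion, so how to prevent poorly performing clients deserves careful consideration. Intuitively, such clients may result from biased universal information shared with others. The paper proposes a pFL strategy called Personalize Locally, Generalize Universally (PLGU), which generalizes the fine-grained universal information and moderates its bias through a Layer-Wised Sharpness Aware Minimization (LWSAM) algorithm while keeping personalization local. PLGU is embedded into the two pFL schemes summarized in the paper, with and without a global model, and the training procedures are given in detail. The analysis shows competitive generalization bounds on both schemes, and experiments show that all proposed PLGU-based algorithms achieve state-of-the-art performance.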
Reliable and Interpretable Personalized Federated Learning
Authors: Zixuan Qin; Liu Yang; Qilong Wang; Yahong Han; Qinghua Hu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Qin_Reliable_and_Interpretable_Personalized_Federated_Learning_CVPR_2023_paper.html
Abstract: Federated learning can coordinate multiple users to participate in data training while ensuring data privacy. The collaboration of multiple agents allows for a natural connection between federated learning and collective intelligence. When there are large differences in data distribution among clients, it is crucial for federated learning to design a reliable client selection strategy and an interpretable client communication framework to better utilize group knowledge. Herein, a reliable personalized federated learning approach, termed RIPFL, is proposed and fully interpreted from the perspective of social learning. RIPFL reliably selects and divides the clients involved in training such that each client can use different amounts of social information and more effectively communicate with other clients. Simultaneously, the method effectively integrates personal information with the social information generated by the global model from the perspective of Bayesian decision rules and evidence theory, enabling individuals to grow better with the help of collective wisdom. An interpretable federated learning mind is well scalable, and the experimental results indicate that the proposed method has superior robustness and accuracy than other state-of-the-art federated learning algorithms.
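abstractTranslation: Federated learning coordinates multiple users in training while protecting data privacy, and the collaboration of multiple agents links it naturally to collective intelligence. When data distributions differ greatly among clients, a reliable client selection strategy and an interpretable client communication framework are crucial for making good use of group knowledge. The paper proposes a reliable personalized federated learning approach, RIPFL, and interprets it fully from the perspective of social learning. RIPFL reliably selects and divides the clients involved in training so that each client can use a different amount of social information and communicate more effectively with other clients. It also integrates personal information with the social information generated by the global model from the perspective of Bayesian decision rules and evidence theory, so that individuals grow better with the help of collective wisdom. The interpretable design scales well, and experiments indicate better robustness and accuracy than other state-of-the-art federated learning algorithms.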
FedSeg: Class-Heterogeneous Federated Learning for Semantic Segmentation
Authors: Jiaxu Miao; Zongxin Yang; Leilei Fan; Yi Yang
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Miao_FedSeg_Class-Heterogeneous_Federated_Learning_for_Semantic_Segmentation_CVPR_2023_paper.html
Abstract: Federated Learning (FL) is a distributed learning paradigm that collaboratively learns a global model across multiple clients with data privacy-preserving. Although many FL algorithms have been proposed for classification tasks, few works focus on more challenging semantic segmentation tasks, especially in the class-heterogeneous FL situation. Compared with classification, the issues from heterogeneous FL for semantic segmentation are more severe: (1) Due to the non-IID distribution, different clients may contain inconsistent foreground-background classes, resulting in divergent local updates. (2) Class-heterogeneity for complex dense prediction tasks makes the local optimum of clients farther from the global optimum. In this work, we propose FedSeg, a basic federated learning approach for class-heterogeneous semantic segmentation. We first propose a simple but strong modified cross-entropy loss to correct the local optimization and address the foreground-background inconsistency problem. Based on it, we introduce pixel-level contrastive learning to enforce local pixel embeddings belonging to the global semantic space. Extensive experiments on four semantic segmentation benchmarks (Cityscapes, CamVID, PascalVOC and ADE20k) demonstrate the effectiveness of our FedSeg. We hope this work will attract more attention from the FL community to the challenging semantic segmentation federated learning.
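abstractTranslation: Federated learning (FL) is a distributed paradigm that collaboratively learns a global model across clients while preserving data privacy. Many FL algorithms target classification, but few address the more challenging task of semantic segmentation, especially in the class-heterogeneous setting. There the problems of heterogeneity are more severe than in classification: (1) because of the non-IID distribution, different clients may contain inconsistent foreground-background classes, producing divergent local updates; (2) class heterogeneity in complex dense prediction tasks pushes each client's local optimum farther from the global optimum. The paper proposes FedSeg, a basic FL approach for class-heterogeneous semantic segmentation. It first introduces a simple but strong modified cross-entropy loss that corrects local optimization and addresses the foreground-background inconsistency, then adds pixel-level contrastive learning to keep local pixel embeddings within the global semantic space. Experiments on four benchmarks (Cityscapes, CamVID, PascalVOC and ADE20k) demonstrate its effectiveness, and the authors hope the work draws more attention from the FL community to federated semantic segmentation.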
GradMA: A Gradient-Memory-Based Accelerated Federated Learning With Alleviated Catastrophic Forgetting
Authors: Kangyang Luo; Xiang Li; Yunshi Lan; Ming Gao
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Luo_GradMA_A_Gradient-Memory-Based_Accelerated_Federated_Learning_With_Alleviated_Catastrophic_Forgetting_CVPR_2023_paper.html
Abstract: Federated Learning (FL) has emerged as a de facto machine learning area and received rapid increasing research interests from the community. However, catastrophic forgetting caused by data heterogeneity and partial participation poses distinctive challenges for FL, which are detrimental to the performance. To tackle the problems, we propose a new FL approach (namely GradMA), which takes inspiration from continual learning to simultaneously correct the server-side and worker-side update directions as well as take full advantage of server's rich computing and memory resources. Furthermore, we elaborate a memory reduction strategy to enable GradMA to accommodate FL with a large scale of workers. We then analyze convergence of GradMA theoretically under the smooth non-convex setting and show that its convergence rate achieves a linear speed up w.r.t the increasing number of sampled active workers. At last, our extensive experiments on various image classification tasks show that GradMA achieves significant performance gains in accuracy and communication efficiency compared to SOTA baselines. We provide our code here: https://github.com/lkyddd/GradMA.
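abstractTranslation: Federated learning (FL) has emerged as a de facto machine learning area with rapidly growing research interest. Catastrophic forgetting caused by data heterogeneity and partial participation, however, poses distinctive challenges that hurt performance. The paper proposes GradMA, an FL approach inspired by continual learning that simultaneously corrects the server-side and worker-side update directions and takes full advantage of the server's rich computing and memory resources. A memory reduction strategy lets GradMA accommodate FL with a large number of workers. The convergence analysis, in the smooth non-convex setting, shows a linear speedup with respect to the number of sampled active workers. Experiments on various image classification tasks show significant gains in accuracy and communication efficiency over SOTA baselines. Code: https://github.com/lkyddd/GradMA.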
Notes:
[pdf] (http://arxiv.org/abs/2302.14307)
[code] (https://github.com/lkyddd/gradma)
Adaptive Channel Sparsity for Federated Learning Under System Heterogeneity
Authors: Dongping Liao; Xitong Gao; Yiren Zhao; Cheng-Zhong Xu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Liao_Adaptive_Channel_Sparsity_for_Federated_Learning_Under_System_Heterogeneity_CVPR_2023_paper.html
Abstract: Owing to the non-i.i.d. nature of client data, channel neurons in federated-learned models may specialize to distinct features for different clients. Yet, existing channel-sparse federated learning (FL) algorithms prescribe fixed sparsity strategies for client models, and may thus prevent clients from training channel neurons collaboratively. To minimize the impact of sparsity on FL convergence, we propose Flado to improve the alignment of client model update trajectories by tailoring the sparsities of individual neurons in each client. Empirical results show that while other sparse methods are surprisingly impactful to convergence, Flado can not only attain the highest task accuracies with unlimited budget across a range of datasets, but also significantly reduce the amount of FLOPs required for training more than by 10x under the same communications budget, and push the Pareto frontier of communication/computation trade-off notably further than competing FL algorithms.
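abstractTranslation: Because client data is non-i.i.d., channel neurons in federated-learned models may specialize to distinct features for different clients. Existing channel-sparse federated learning (FL) algorithms prescribe fixed sparsity strategies for client models and may therefore prevent clients from training channel neurons collaboratively. To minimize the impact of sparsity on FL convergence, the paper proposes Flado, which improves the alignment of client model update trajectories by tailoring the sparsity of individual neurons in each client. Empirically, other sparse methods turn out to affect convergence surprisingly strongly, whereas Flado attains the highest task accuracies with unlimited budget across a range of datasets, reduces the FLOPs required for training by more than 10x under the same communication budget, and pushes the Pareto frontier of the communication/computation trade-off notably further than competing FL algorithms.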
Author: 白小鱼 (Bai Xiaoyu), PhD student, Department of Computer Science, Shanghai Jiao Tong University